eye movement data
Dynamic Fusion of Eye Movement Data and Verbal Narrations in Knowledge-rich Domains
We propose to jointly analyze experts' eye movements and verbal narrations to discover important and interpretable knowledge patterns that shed light on their decision-making processes. The discovered patterns can further enhance data-driven statistical models by fusing experts' domain knowledge to support complex human-machine collaborative decision-making. Our key contribution is a novel dynamic Bayesian nonparametric model that assigns latent knowledge patterns to the key phases involved in complex decision-making. Each phase is characterized by a unique distribution of word topics discovered from verbal narrations and their dynamic interactions with eye movement patterns, indicating experts' distinctive perceptual behavior within a given decision-making stage. A new split-merge-switch sampler is developed to efficiently explore the posterior state space with an improved mixing rate. Case studies on diagnostic error prediction and disease morphology categorization demonstrate the effectiveness of the proposed model and the discovered knowledge patterns.
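The abstract does not spell out the sampler, but the key ingredient of any split-merge scheme is a Metropolis-Hastings acceptance ratio comparing the merged and split configurations. Below is a minimal sketch of that ratio for a *merge* move in a much simpler stand-in model, a Dirichlet-process mixture of 1-D Gaussians with known variance; the model, hyperparameters, and omission of the proposal-density correction are all assumptions for illustration, not the authors' method.

```python
import numpy as np
from scipy.special import gammaln

SIGMA2 = 1.0   # known observation variance (assumed for this sketch)
TAU2 = 4.0     # prior variance of a cluster mean (assumed)
MU0 = 0.0      # prior mean of a cluster mean (assumed)
ALPHA = 1.0    # Dirichlet-process concentration (assumed)

def log_marginal(x):
    """Log marginal likelihood of one cluster's data:
    x_i ~ N(mu, SIGMA2), mu ~ N(MU0, TAU2), with mu integrated out."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - MU0
    quad = (d @ d - (TAU2 / (SIGMA2 + n * TAU2)) * d.sum() ** 2) / SIGMA2
    return -0.5 * (n * np.log(2 * np.pi * SIGMA2)
                   + np.log1p(n * TAU2 / SIGMA2) + quad)

def log_merge_ratio(x1, x2):
    """Log target ratio p(merged)/p(split): marginal-likelihood ratio
    times the CRP prior ratio Gamma(n1+n2) / (alpha Gamma(n1) Gamma(n2)).
    A full sampler would also include the split/merge proposal densities."""
    n1, n2 = len(x1), len(x2)
    lik = (log_marginal(np.concatenate([x1, x2]))
           - log_marginal(x1) - log_marginal(x2))
    prior = gammaln(n1 + n2) - gammaln(n1) - gammaln(n2) - np.log(ALPHA)
    return lik + prior

rng = np.random.default_rng(0)
a = rng.normal(-2.0, 1.0, size=30)   # two well-separated clusters:
b = rng.normal(+2.0, 1.0, size=30)   # merging them should be unattractive
print(log_merge_ratio(a, b))         # strongly negative -> merge rejected
c = rng.normal(-2.0, 1.0, size=30)   # same component as `a`:
print(log_merge_ratio(a, c))         # positive -> merge favoured
```

The "switch" move of the paper's sampler would additionally reassign patterns across phases; that part is not sketched here.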
Task Decoding based on Eye Movements using Synthetic Data Augmentation
Sadhu, Shanmuka, Baran, Arca, Pandey, Preeti, Kumar, Ayush
Machine learning has been used extensively in applications related to eye-tracking research. Understanding eye movements is one of the most significant subsets of eye-tracking research, as it reveals an individual's scanning pattern. Researchers have thoroughly analyzed eye movement data to study various eye-tracking applications, such as attention mechanisms, navigational behavior, and task understanding. Traditional machine learning algorithms used to decode tasks from eye movement data have produced mixed reactions to Yarbus' claim that it is possible to decode the observer's task from their eye movements. In this paper, to support Yarbus' hypothesis, we decode task categories while generating synthetic data samples with the well-known synthetic data generator CTGAN and its variants, such as CopulaGAN and the Gretel AI synthetic data generator, on data available from an in-person user study. Our results show that augmenting real eye movement data with additional synthetically generated samples improves classification accuracy even with traditional machine learning algorithms. We see a significant improvement in task decoding accuracy, from 28.1% using Random Forest to 82% using InceptionTime, when five times more data is added to the 320 real eye movement samples. Our proposed framework outperforms all available studies on this dataset because of the use of additional synthetic data. We validated our claim with various algorithms and combinations of real and synthetic data, showing how decoding accuracy increases as more generated data is added to the real data.
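A minimal sketch of this augment-then-classify pipeline, using the open-source SDV library's CTGAN synthesizer. The feature names and toy data below are placeholders, not the study's actual features or dataset; only held-out real data is used for evaluation.

```python
# pip install sdv scikit-learn
import numpy as np
import pandas as pd
from sdv.metadata import SingleTableMetadata
from sdv.single_table import CTGANSynthesizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Placeholder stand-in for the 320-sample eye movement dataset:
# a few aggregate scanpath features plus a task label.
rng = np.random.default_rng(0)
n = 320
real = pd.DataFrame({
    "fix_dur_mean": rng.normal(220, 40, n),    # mean fixation duration (ms)
    "sacc_amp_mean": rng.normal(4.5, 1.2, n),  # mean saccade amplitude (deg)
    "fix_count": rng.integers(20, 120, n),
    "task": rng.choice(["free", "search", "memory", "count"], n),
})

train, test = train_test_split(real, test_size=0.25,
                               stratify=real["task"], random_state=0)

# Fit CTGAN on the real training split (label column included, so
# synthetic rows come with task labels) and sample 5x more data.
metadata = SingleTableMetadata()
metadata.detect_from_dataframe(train)
synth = CTGANSynthesizer(metadata, epochs=100)
synth.fit(train)
fake = synth.sample(num_rows=5 * len(train))

augmented = pd.concat([train, fake], ignore_index=True)
clf = RandomForestClassifier(n_estimators=300, random_state=0)
clf.fit(augmented.drop(columns="task"), augmented["task"])

# Evaluate on held-out *real* data only.
pred = clf.predict(test.drop(columns="task"))
print("accuracy on real test split:", accuracy_score(test["task"], pred))
```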
- North America > United States > New Jersey (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- Information Technology (0.47)
- Health & Medicine (0.46)
Review for NeurIPS paper: Dynamic Fusion of Eye Movement Data and Verbal Narrations in Knowledge-rich Domains
Weaknesses: I would have liked to see more examples in the discussion of the topics that were detected. It would be helpful if, in Table 1 and other similar illustrations, the different topics that the colored words correspond to were explicitly indicated. In the supplementary material, the table showing topics (Table 4) is useful, but I am curious to understand more about the links between the words in each topic category. Regarding baselines, I realize that in multimodal problems, especially those using modalities that are infrequently employed (e.g., eye tracking), it is difficult to find state-of-the-art models that are appropriate. So this is not a major criticism, but it does feel that the justification of the chosen baselines could be expanded.
Review for NeurIPS paper: Dynamic Fusion of Eye Movement Data and Verbal Narrations in Knowledge-rich Domains
This paper has a lot of content: an interesting cognitive science question of modelling human decision-making, data fusion of texts and eye movements, modelled with a new dynamic Bayesian nonparametric model, and a new sampler for the model. The paper received an unusual amount of attention: five reviews, which were needed because it makes several different kinds of contributions. Hence it is not a stereotypical good conference paper with one neat idea and convincing theoretical or empirical support for it. Reviewers discussed the paper intensively, concluding that it is likely to be of interest at NeurIPS, and since there is no easy fix to make it more suitable to the format, such as dividing it into two papers, it is good enough to be accepted, though not among the best papers. Clarity can easily be improved by the authors, and additional details added to both the paper and the supplement.
- Information Technology > Artificial Intelligence > Cognitive Science (0.75)
- Information Technology > Artificial Intelligence > Representation & Reasoning (0.65)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.65)
EMTeC: A Corpus of Eye Movements on Machine-Generated Texts
Bolliger, Lena Sophia, Haller, Patrick, Cretton, Isabelle Caroline Rose, Reich, David Robert, Kew, Tannon, Jäger, Lena Ann
The Eye Movements on Machine-Generated Texts Corpus (EMTeC) is a naturalistic eye-movements-while-reading corpus of 107 native English speakers reading machine-generated texts. The texts are generated by three large language models using five different decoding strategies, and they fall into six different text type categories. EMTeC includes the eye movement data at all stages of pre-processing, i.e., the raw coordinate data sampled at 2000 Hz, the fixation sequences, and the reading measures. It further provides both the original and a corrected version of the fixation sequences, accounting for vertical calibration drift. Moreover, the corpus includes the language models' internals that underlie the generation of the stimulus texts: the transition scores, the attention scores, and the hidden states. The stimuli are annotated for a range of linguistic features at both the text and word level. We anticipate EMTeC to be utilized for a variety of use cases such as, but not restricted to, the investigation of reading behavior on machine-generated text and the impact of different decoding strategies; reading behavior on different text types; the development of new pre-processing, data filtering, and drift correction algorithms; the cognitive interpretability and enhancement of language models; and the assessment of the predictive power of surprisal and entropy for human reading times. The data at all stages of pre-processing, the model internals, and the code to reproduce the stimulus generation, data pre-processing and analyses can be accessed via https://github.com/DiLi-Lab/EMTeC/.
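A hypothetical sketch of a first analysis on such a corpus: aggregating a word-level reading measure by text type and decoding strategy. The file and column names below are assumptions; consult the repository's README for the corpus's actual layout.

```python
import pandas as pd

# Hypothetical file and column names -- see https://github.com/DiLi-Lab/EMTeC/
# for the real reading-measure files and their schemas.
rm = pd.read_csv("reading_measures.csv")

# Mean gaze duration per text type and decoding strategy: a typical first
# look at whether generation settings modulate reading behaviour.
summary = (rm.groupby(["text_type", "decoding_strategy"])["gaze_duration"]
             .agg(["mean", "std", "count"])
             .sort_values("mean"))
print(summary)
```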
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- (24 more...)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Education (1.00)
- Government > Regional Government > North America Government > United States Government (0.92)
Modeling Human Eye Movements with Neural Networks in a Maze-Solving Task
Li, Jason, Watters, Nicholas, Wang, Yingting, Sohn, Hansem, Jazayeri, Mehrdad
From smoothly pursuing moving objects to rapidly shifting gazes during visual search, humans employ a wide variety of eye movement strategies in different contexts. While eye movements provide a rich window into mental processes, building generative models of eye movements is notoriously difficult, and to date the computational objectives guiding eye movements remain largely a mystery. In this work, we tackled these problems in the context of a canonical spatial planning task, maze-solving. We collected eye movement data from human subjects and built deep generative models of eye movements using a novel differentiable architecture for gaze fixations and gaze shifts. We found that human eye movements are best predicted by a model that is optimized not to perform the task as efficiently as possible but instead to run an internal simulation of an object traversing the maze. This not only provides a generative model of eye movements in this task but also suggests a computational theory for how humans solve the task, namely that humans use mental simulation.
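The paper's differentiable gaze architecture is not reproduced here; as a toy illustration of the "internal simulation" hypothesis, the sketch below generates a simulated gaze trajectory by stepping an object along the maze's solution path (plain BFS) and scores human fixations by their distance to that trajectory. The maze, fixations, and scoring rule are all invented for illustration, not the authors' model.

```python
from collections import deque
import numpy as np

def solve_maze(grid, start, goal):
    """BFS shortest path through a 0/1 grid (1 = wall)."""
    h, w = grid.shape
    prev, q = {start: None}, deque([start])
    while q:
        r, c = q.popleft()
        if (r, c) == goal:
            break
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < h and 0 <= nc < w and grid[nr, nc] == 0 \
                    and (nr, nc) not in prev:
                prev[(nr, nc)] = (r, c)
                q.append((nr, nc))
    path, node = [], goal
    while node is not None:   # walk back from goal to start
        path.append(node)
        node = prev[node]
    return path[::-1]

grid = np.array([[0, 0, 0, 1],
                 [1, 1, 0, 1],
                 [0, 0, 0, 0],
                 [0, 1, 1, 0]])
sim_gaze = np.array(solve_maze(grid, (0, 0), (3, 3)), dtype=float)

# Toy "human" fixations; score = mean distance to the nearest simulated
# gaze point (lower = better fit of the mental-simulation account).
fixations = np.array([[0.1, 0.2], [0.2, 1.8], [1.9, 2.1],
                      [2.0, 3.1], [3.2, 2.9]])
d = np.linalg.norm(fixations[:, None, :] - sim_gaze[None, :, :], axis=-1)
print("mean nearest-point distance:", d.min(axis=1).mean())
```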
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > Netherlands > Drenthe > Assen (0.04)
Latent gaze information in highly dynamic decision-tasks
Digitization is penetrating more and more areas of life. Tasks are increasingly completed digitally, and are therefore fulfilled not only faster and more efficiently but also more purposefully and successfully. The rapid developments in the field of artificial intelligence in recent years have played a major role in this, as they have produced many helpful approaches to build on. At the same time, the eyes, their movements, and the meaning of these movements are being researched more and more thoroughly. The combination of these developments has led to exciting approaches. In this dissertation, I present some of the approaches I worked on during my Ph.D. First, I provide insight into the development of models that use artificial intelligence to connect eye movements with visual expertise. This is demonstrated for two domains, or rather two groups of people: athletes in decision-making actions and surgeons in arthroscopic procedures. The resulting models can be considered digital diagnostic models for automatic expertise recognition. Furthermore, I show approaches that investigate the transferability of eye movement patterns to different expertise domains and, subsequently, important aspects of techniques for generalization. Finally, I address the temporal detection of confusion based on eye movement data. The results suggest using the resulting model as a clock signal for possible digital assistance options in the training of young professionals. An interesting aspect of my research is that I was able to draw on very valuable data from DFB youth elite athletes as well as from long-standing experts in arthroscopy. In particular, the work with the DFB data attracted the interest of radio and print media, namely DeutschlandFunk Nova and SWR DasDing. All resulting articles presented here have been published in internationally renowned journals or at conferences.
- North America > United States > California > Los Angeles County > Los Angeles (0.13)
- North America > Canada > Ontario > Toronto (0.13)
- Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
- (11 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Overview (1.00)
- Instructional Material (1.00)
- Leisure & Entertainment > Sports > Soccer (1.00)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Information Technology (1.00)
- (9 more...)
Task Classification Model for Visual Fixation, Exploration, and Search
Kumar, Ayush, Tyagi, Anjul, Burch, Michael, Weiskopf, Daniel, Mueller, Klaus
Yarbus' claim that the observer's task can be decoded from eye movements has received mixed reactions. In this paper, we support the hypothesis that it is possible to decode the task. We conducted an exploratory analysis of the dataset by projecting features and data points into a scatter plot to visualize the nuanced properties of each task. Following this analysis, we eliminated highly correlated features before training SVM and AdaBoost classifiers to predict the tasks from the filtered eye movement data. We achieve an accuracy of 95.4% on this task classification problem and hence support the hypothesis that task classification is possible from a user's eye movement data.
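A minimal sketch of the described pipeline: drop one feature of every highly correlated pair, then train SVM and AdaBoost classifiers. The placeholder data, feature names, and the 0.9 correlation threshold are assumptions, not the paper's actual settings.

```python
import numpy as np
import pandas as pd
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

# Placeholder feature matrix with one deliberately redundant feature.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(300, 8)),
                 columns=[f"feat{i}" for i in range(8)])
X["feat8"] = X["feat0"] * 0.98 + rng.normal(scale=0.05, size=300)
y = rng.choice(["fixation", "exploration", "search"], 300)

# Drop one feature of every pair with |r| above the threshold.
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
drop = [c for c in upper.columns if (upper[c] > 0.9).any()]
X_filtered = X.drop(columns=drop)
print("dropped:", drop)

for clf in (make_pipeline(StandardScaler(), SVC(kernel="rbf")),
            AdaBoostClassifier(n_estimators=200)):
    scores = cross_val_score(clf, X_filtered, y, cv=5)
    print(type(clf).__name__, scores.mean().round(3))
```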
- North America > United States > Colorado > Denver County > Denver (0.06)
- North America > United States > New York > Suffolk County > Stony Brook (0.05)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.05)
- (3 more...)
Learning to Predict Readability Using Eye-Movement Data From Natives and Learners
González-Garduño, Ana V. (University of Copenhagen) | Søgaard, Anders (University of Copenhagen)
Readability assessment can improve the quality of assistive technologies aimed at language learners. Eye-tracking data has been used for both inducing and evaluating general-purpose NLP/AI models, and below we show that, unsurprisingly, gaze data from language learners can also improve multi-task readability assessment models. This is unsurprising, since the gaze data records the reading difficulties of the learners. Unfortunately, eye-tracking data from language learners is often much harder to obtain than eye-tracking data from native speakers. We therefore compare the performance of deep learning readability models that use native-speaker eye movement data to models using data from language learners. Somewhat surprisingly, we observe no significant drop in performance when replacing learner data with native-speaker data, making approaches that rely on native-speaker gaze information more scalable. In other words, our finding is that language learner difficulties can be efficiently estimated from native speakers, which suggests that, more generally, readily available gaze data can be used to improve educational NLP/AI models targeted at language learners.
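A minimal sketch of the multi-task idea: a shared encoder with a readability head and an auxiliary gaze-prediction head, trained jointly. The architecture, feature dimensions, random placeholder data, and the 0.5 auxiliary loss weight are all assumptions, not the paper's actual model.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Placeholder data: 256 sentences, 32 text features each; targets are a
# readability score and a mean fixation duration (both invented here).
X = torch.randn(256, 32)
y_read = torch.randn(256, 1)
y_gaze = torch.randn(256, 1)

class MultiTaskReadability(nn.Module):
    """Shared encoder with two heads: readability and gaze prediction.
    The gaze head acts as an auxiliary task regularizing the encoder."""
    def __init__(self, d_in=32, d_hid=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, d_hid), nn.ReLU())
        self.read_head = nn.Linear(d_hid, 1)
        self.gaze_head = nn.Linear(d_hid, 1)

    def forward(self, x):
        h = self.encoder(x)
        return self.read_head(h), self.gaze_head(h)

model = MultiTaskReadability()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
mse = nn.MSELoss()

for epoch in range(200):
    opt.zero_grad()
    pred_read, pred_gaze = model(X)
    # Auxiliary gaze loss weighted at 0.5 (weight is an assumption).
    loss = mse(pred_read, y_read) + 0.5 * mse(pred_gaze, y_gaze)
    loss.backward()
    opt.step()
print("final loss:", loss.item())
```

Whether the gaze data comes from natives or learners only changes `y_gaze`; the paper's finding is that the two sources yield comparably useful auxiliary signals.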
- Atlantic Ocean > Mediterranean Sea (0.04)
- Africa (0.04)
- Europe > Denmark > Capital Region > Copenhagen (0.04)